-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DX-24413, DX-24414 : Gandiva cache peformance improvements #15
base: rel-460
Are you sure you want to change the base?
Conversation
projjal
commented
Aug 5, 2020
•
edited
Loading
edited
- Prewarm the gandiva cache on gandiva startup time
- Make the cache size configurable
@@ -42,8 +49,24 @@ class Cache { | |||
} | |||
|
|||
private: | |||
static int GetCapacity() { | |||
int capacity; | |||
const char* env_cache_size = std::getenv("GANDIVA_CACHE_SIZE"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cant we instead pass this as a configuration (that is in turn passed from a support option) ? that way we can support other customers too..
https://github.com/apache/arrow/blob/master/cpp/src/gandiva/configuration.h
std::ifstream fin; | ||
fin.open(dir_iter->path().string(), std::ios::binary); | ||
if (!fin.is_open()) { | ||
std::cerr << "[Gandiva Cache Prewarm] Failure opening warmup cache file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe also mention which file..
|
||
int32_t schema_len; | ||
if (remaining < sizeof(schema_len)) { | ||
std::cerr << "[Gandiva Cache Prewarm] Invalid file." << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
across the board - file name along with errors.
@@ -117,6 +128,9 @@ jint JNI_OnLoad(JavaVM* vm, void* reserved) { | |||
env->GetFieldID(vector_expander_ret_class_, "address", "J"); | |||
vector_expander_ret_capacity_ = | |||
env->GetFieldID(vector_expander_ret_class_, "capacity", "J"); | |||
|
|||
PrewarmCache(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this going to block all queries until the cache is warmed? should we do this in the background instead for e.g. if we have a 100 expression list this is going to take upto a minute to prepare the cache during which no query can progress
} | ||
|
||
fs::path path(env_path); | ||
if (!fs::is_directory(path) && !fs::create_directories(path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this thread safe? two threads trying to create the directory at the same time
@@ -660,6 +861,13 @@ JNIEXPORT jlong JNICALL Java_org_apache_arrow_gandiva_evaluator_JniWrapper_build | |||
holder = std::shared_ptr<ProjectorHolder>( | |||
new ProjectorHolder(schema_ptr, ret_types, std::move(projector))); | |||
module_id = projector_modules_.Insert(holder); | |||
|
|||
if (!cache_hit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is the expectation, apple will get files from all executors? wont there be duplicates in that case..
also is the expectation that cache size be at-least as big to accommodate all expressions needed?
are new queries expected in this cluster - if yes and new a query comes in now we might evict one of these expressions. is that ok
…ache#41152) ### Rationale for this change An error is received installing R duckdb: ``` dremio#15 18.13 > remotes::install_github('duckdb/duckdb-r', build = FALSE) dremio#15 18.27 Error: Failed to install 'unknown package' from **GitHub:** dremio#15 18.27 Line starting 'Roxyg ...' is malformed! ``` Some searching seems to suggest that this is because R cannot process UTF-8 characters in DESCRIPTION files if the `LANG` is set to `C`. ### What changes are included in this PR? The `LANG` is set to `C.UTF-8` in the dockerfile for this CI job ### Are these changes tested? The change only affects a test ### Are there any user-facing changes? No * GitHub Issue: apache#41145 Authored-by: Weston Pace <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
…ache#41152) ### Rationale for this change An error is received installing R duckdb: ``` dremio#15 18.13 > remotes::install_github('duckdb/duckdb-r', build = FALSE) dremio#15 18.27 Error: Failed to install 'unknown package' from **GitHub:** dremio#15 18.27 Line starting 'Roxyg ...' is malformed! ``` Some searching seems to suggest that this is because R cannot process UTF-8 characters in DESCRIPTION files if the `LANG` is set to `C`. ### What changes are included in this PR? The `LANG` is set to `C.UTF-8` in the dockerfile for this CI job ### Are these changes tested? The change only affects a test ### Are there any user-facing changes? No * GitHub Issue: apache#41145 Authored-by: Weston Pace <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
…n timezone (apache#45051) ### Rationale for this change If the timezone database is present on the system, but does not contain a timezone referenced in a ORC file, the ORC reader will crash with an uncaught C++ exception. This can happen for example on Ubuntu 24.04 where some timezone aliases have been removed from the main `tzdata` package to a `tzdata-legacy` package. If `tzdata-legacy` is not installed, trying to read a ORC file that references e.g. the "US/Pacific" timezone would crash. Here is a backtrace excerpt: ``` dremio#12 0x00007f1a3ce23a55 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6 dremio#13 0x00007f1a3ce39391 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6 dremio#14 0x00007f1a3f4accc4 in orc::loadTZDB(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#15 0x00007f1a3f4ad392 in std::call_once<orc::LazyTimezone::getImpl() const::{lambda()dremio#1}>(std::once_flag&, orc::LazyTimezone::getImpl() const::{lambda()dremio#1}&&)::{lambda()dremio#2}::_FUN() () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#16 0x00007f1a4298bec3 in __pthread_once_slow (once_control=0xa5ca7c8, init_routine=0x7f1a3ce69420 <__once_proxy>) at ./nptl/pthread_once.c:116 dremio#17 0x00007f1a3f4a9ad0 in orc::LazyTimezone::getEpoch() const () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#18 0x00007f1a3f4e76b1 in orc::TimestampColumnReader::TimestampColumnReader(orc::Type const&, orc::StripeStreams&, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#19 0x00007f1a3f4e84ad in orc::buildReader(orc::Type const&, orc::StripeStreams&, bool, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#20 0x00007f1a3f4e8dd7 in orc::StructColumnReader::StructColumnReader(orc::Type const&, orc::StripeStreams&, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#21 0x00007f1a3f4e8532 in orc::buildReader(orc::Type const&, orc::StripeStreams&, bool, bool, bool) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#22 0x00007f1a3f4925e9 in orc::RowReaderImpl::startNextStripe() () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#23 0x00007f1a3f492c9d in orc::RowReaderImpl::next(orc::ColumnVectorBatch&) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 dremio#24 0x00007f1a3e6b251f in arrow::adapters::orc::ORCFileReader::Impl::ReadBatch(orc::RowReaderOptions const&, std::shared_ptr<arrow::Schema> const&, long) () from /tmp/arrow-HEAD.ArqTs/venv-wheel-3.12-manylinux_2_17_x86_64.manylinux2014_x86_64/lib/python3.12/site-packages/pyarrow/libarrow.so.1900 ``` ### What changes are included in this PR? Catch C++ exceptions when iterating ORC batches instead of letting them slip through. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#40633 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>